I am a data assimilation research scientist and scientific software developer at the Center for Western Weather and Water Extremes (CW3E).
One of CW3E's core programs is the Atmospheric River Reconnaissance (AR-Recon) program.
AR-Recon is led by CW3E and the NWS/NCEP, with core partners including multiple academic institutions and:
I am a co-PI on projects in collaboration with the U.S. Air Force, leading the development and adoption of the novel Joint Effort for Data Assimilation Integration (JEDI) and Model for Prediction Across Scales (MPAS) framework.
While the JEDI framework will be adopted for operations in the future, it is still undergoing rapid development and extensive testing.
However, there is currently a major gap in open-source software for operating legacy data assimilation systems for benchmark studies.
Workflow Management is a concept that originated in the 1970’s to handle business process management… to manage complex collections of business processes that need to be carried out in a certain way with complex interdependencies and requirements…
…scientific workflows are driven by the scientific data that “flows” through them… usually triggered by the availability of some kind of input data, and a task’s result is usually… fed as input to another task in the workflow.
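The data-driven triggering described above can be illustrated with a minimal Python sketch. It is not how Rocoto or any other workflow manager is implemented; the `Task` class, the `run_workflow` function, and the task and data names in the example are all hypothetical and chosen only for illustration. Each task runs as soon as all of its named inputs are available, and its result becomes available as input to later tasks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Task:
    """A hypothetical workflow task, triggered by the availability of its inputs."""
    name: str
    inputs: List[str]            # names of data items required before running
    output: str                  # name of the data item this task produces
    run: Callable[..., object]   # function applied to the input values, in order


def run_workflow(
    tasks: List[Task], available: Dict[str, object]
) -> Tuple[List[str], Dict[str, object]]:
    """Run every task once its inputs exist; return the run order and all data."""
    data = dict(available)
    pending = list(tasks)
    order: List[str] = []
    while pending:
        # A task is ready only when every one of its inputs is present.
        ready = [t for t in pending if all(i in data for i in t.inputs)]
        if not ready:
            missing = {
                t.name: [i for i in t.inputs if i not in data] for t in pending
            }
            raise RuntimeError(f"workflow stalled; missing inputs: {missing}")
        for task in ready:
            # The result of one task is fed forward as input to later tasks.
            data[task.output] = task.run(*(data[i] for i in task.inputs))
            order.append(task.name)
            pending.remove(task)
    return order, data


# Illustrative toy tasks (assumed names), deliberately listed out of order.
tasks = [
    Task("forecast", ["analysis"], "forecast", lambda a: f"forecast({a})"),
    Task(
        "analysis",
        ["background", "observations"],
        "analysis",
        lambda b, o: f"analysis({b},{o})",
    ),
]
order, data = run_workflow(tasks, {"background": "bg", "observations": "obs"})
print(order)
```

In this toy example the forecast task is listed first but cannot run until the analysis it depends on has been produced, so the order of execution is determined by the data rather than by the order in which the tasks were declared.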
My own methodological framework treats the data assimilation problem as a statistical learning problem.
I need to run many simulations to study:
My re-forecasting workflows are also non-standard from the perspective of operational forecasting;
I also perform simulations on multiple HPC platforms with different system architectures, job schedulers and software stacks, so I need to keep my software as portable and system-agnostic as possible.
These demands have led me to develop an experimental end-to-end data assimilation cycling system in the GSI-WRF-MET stack using the Rocoto Workflow Manager.
Each of these tasks is called as a scripted job, with run-time settings defined by the cycling workflow.
E.g., the WRF driver script, wrf.sh, runs differently depending on whether it is:
Currently, experiments are organized in a case study / control flow hierarchy:
All associated settings are written to a static directory which is sourced by the Rocoto workflow;
MET forecast verification is a key component of this workflow, used to objectively assess forecast skill.
We now shift focus to the MET verification tools in this workflow, with the 2022-2023 AR season as a highly applicable use case for these codes.
This is a perfect example of where one needs to analyze data from multiple:
Batch processing all this data is a non-trivial task, as a workflow should:
This problem arises frequently when tuning the GSI-WRF-Cycling-Template.
This is an ongoing research-to-operations project at CW3E to generate open-source tools for both purposes.
This work benefits from community interaction, especially collaboration with the NRT and Verification Teams on workflow and research tool development.
As such, this infrastructure project presents opportunities for wider collaboration where scientific dissemination may be restricted.
Folks who are interested in collaborating on these tools are encouraged to reach out,
A late colleague of mine from when I was a postdoc in Norway, Yongqi Gao, liked to reference a proverb that is applicable to open source development:
“If you want to go fast, go yourself. If you want to go far, go together.”